A Comparative Analysis of Learning Techniques for Cancer Risk Prediction based on Medical Textual Records

Authors

  • Carolina Fócil Arias
  • Grigori Sidorov
  • Alexander F. Gelbukh
  • Miguel A. Sánchez-Pérez
Abstract

In this paper, we compare the performance of a variety of machine learning algorithms, including supervised Naïve Bayes, J48, SVM, Random Tree, and Random Forest, and non-supervised KNN, for determining the type of cancer a patient is suffering from using medical textual records. We train these classifiers on different sets of features, such as word unigrams and bigrams and character n-grams, using the tf-idf weighting scheme and a binary feature representation. We evaluate the performance of the classifiers in terms of accuracy, precision, recall, and F-measure. The results show that Naïve Bayes and SVM achieve the best performance on this task.
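The feature sets and evaluation measures named in the abstract can be illustrated with a minimal, standard-library Python sketch. The abstract does not state which tf-idf variant, tokenizer, or toolkit the authors used, so everything here is an assumption made for illustration: the function names are hypothetical, tokenization is plain whitespace splitting, and tf-idf is taken as raw term frequency times log(N / df).

```python
import math
from collections import Counter


def word_ngrams(text, n):
    """Word n-grams (as tuples) from a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Character n-grams from a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def binary_features(docs_terms):
    """Binary representation: 1 for every term that occurs in the document."""
    return [{term: 1 for term in set(terms)} for terms in docs_terms]


def tfidf_features(docs_terms):
    """tf-idf weights, assuming raw term frequency and idf = log(N / df)."""
    n_docs = len(docs_terms)
    # Document frequency: number of documents containing each term.
    df = Counter(term for terms in docs_terms for term in set(terms))
    vectors = []
    for terms in docs_terms:
        tf = Counter(terms)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def evaluate(y_true, y_pred, positive):
    """Accuracy, plus precision, recall, and F-measure for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_measure
```

Each function takes or returns plain lists and dictionaries, so the sparse vectors can be passed to any classifier implementation. A term that occurs in every document receives a tf-idf weight of zero under this variant, whereas the binary representation still marks it as present.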

Similar resources

Applying Two Computational Classification Methods to Predict the Risk of Breast Cancer: A Comparative Study

Introduction: The lack of a proper method for early detection and diagnostic errors in medicine are fundamental problems in treating cancer. Data analysis techniques may significantly help early diagnosis. The current study aimed at applying and evaluating neural networks and a decision tree algorithm on breast cancer patients’ data for early cancer prediction. Methods: In the current stu...

A Comparative Analysis of the Effect of Visual and Textual Information on the Health Information Perception of High School Girl Students in Tehran

Purpose: Information and information sources can be divided into three broad categories according to their nature or type: textual information (books, journal articles, conference papers, dissertations, newspapers, etc.), visual information (infographics, photos, cartoons, films, etc.), and audiovisual information. The purpose of this study is to determine the effect of reading textual information in c...

Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...

The Effect of Visual Representation, Textual Representation, and Glossing on Second Language Vocabulary Learning

In this study, the researcher chose three different vocabulary techniques (Visual Representation, Textual Enhancement, and Glossing) and compared them with the traditional method of teaching vocabulary. 80 advanced EFL learners were assigned to four intact groups (three experimental groups and one control group) using a proficiency test and a vocabulary test as a pre-test. In the visual group, stu...

Journal:
  • Research in Computing Science

Volume 130

Publication date: 2016